Giveme5W: Main Event Retrieval from News Articles by Extraction of the Five Journalistic W Questions

Authors

  • Felix Hamborg
  • Soeren Lachnit
  • Moritz Schubotz
  • Thomas Hepp
  • Bela Gipp
Abstract

Extracting event descriptors from news articles is a common requirement for tasks such as clustering related articles, summarization, and news aggregation. Because generally usable, publicly available methods optimized for news are lacking, many researchers must redundantly implement such methods for their own projects. Answers to the five journalistic W questions (5Ws) describe the main event of a news article, i.e., who did what, when, where, and why. The main contribution of this paper is Giveme5W, the first open-source, syntax-based 5W extraction system for news articles. The system retrieves an article's main event by extracting phrases that answer the journalistic 5Ws. In an evaluation with three assessors and 60 articles, we find that the extraction precision of 5W phrases is p = 0.7.
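To make the reported precision figure concrete, the following minimal sketch shows one way a precision value over judged 5W phrases could be computed. It is not the Giveme5W implementation or its evaluation procedure: the `Judgment` record, the binary `relevant` flag, and the sample phrases are all hypothetical, and the abstract does not specify the relevance scale the assessors actually used.

```python
from dataclasses import dataclass
from typing import Dict, List

# The five journalistic W questions named in the abstract.
QUESTIONS = ("who", "what", "when", "where", "why")


@dataclass
class Judgment:
    """One extracted phrase plus an assessor verdict (hypothetical structure)."""
    question: str   # one of QUESTIONS
    phrase: str     # phrase an extractor returned for that question
    relevant: bool  # assumption: a binary relevance verdict


def precision(judgments: List[Judgment]) -> float:
    """Fraction of extracted phrases that were judged relevant."""
    if not judgments:
        raise ValueError("precision is undefined for an empty list")
    return sum(j.relevant for j in judgments) / len(judgments)


def precision_by_question(judgments: List[Judgment]) -> Dict[str, float]:
    """Precision computed separately for each W question that has judgments."""
    result: Dict[str, float] = {}
    for question in QUESTIONS:
        subset = [j for j in judgments if j.question == question]
        if subset:
            result[question] = precision(subset)
    return result


# Made-up judgments for a single made-up article, for illustration only.
sample = [
    Judgment("who", "the city council", True),
    Judgment("what", "approved a new budget", True),
    Judgment("when", "on Tuesday", True),
    Judgment("where", "in the town hall", True),
    Judgment("why", "after the meeting", False),
]

overall = precision(sample)
per_question = precision_by_question(sample)
```

With these made-up judgments, four of the five phrases are judged relevant, so `overall` is 4/5. Splitting the score by question, as `precision_by_question` does, shows which of the five questions an extractor answers least reliably.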


Similar resources

Frame Labeling of Competing Narratives in Journalistic Translation

Studying translations during times of conflict has gained currency in translation studies over the past decade. One way in which conflict manifests itself is in how different countries choose to name an event or a geographical location. This study set out to understand how translation of rival names and labeling was carried out in Iranian state-run news agencies. To...


From Academic to Journalistic Texts: A Qualitative Analysis of the Evaluative Language of Science

This study examined academic articles and journalistic reports in 5 disciplinary areas to explore how similar contents might attitudinally be realized in two different genres. To this end, 25 research articles and 210 news reports were carefully selected and underwent detailed discourse semantic and grammatical analyses with the purpose of identifying the evaluative linguistic patterns....


Rich Interfaces for Browsing News in Blog Posts

Semantic models of news can enable richer interfaces for end-users to learn the context of news events referenced in blog posts. We present Brussell, a system that uses content-specific models of news event situations to perform anticipatory information retrieval, organize extraction results and present a novel, structured interface for navigating among the events of a news situation. ...



Novel User Interfaces via Model-Mediated Information Retrieval

Using content-specific models to guide information retrieval can provide richer interfaces to end-users in both navigating news articles and learning the context of news events. We present Brussell, a system that uses semantic models of news event situations to perform anticipatory information retrieval, organize extraction results and present a novel interface for navigating among the mileston...


Event Tracking

This paper introduces Event Tracking, a new application of Information Retrieval technology with interesting research and evaluation questions. We describe the problem, a pilot corpus of news stories that was constructed for experimental studies, and a "rolling" evaluation strategy that uses different segments of the corpus for each query. As part of a preliminary evaluation on a small pilot stu...




Publication date: 2018